Xiru Lyu
Consulting for Statistics, Computing & Analytics Research (CSCAR), University of Michigan
2022-11-30
CSCAR provides individualized guidance and training to U-M researchers (faculty, staff, graduate students) on data collection, management, and analysis. The team also supports the use of statistical software and advanced computing.
We hold workshops on a variety of statistical topics.
CSCAR statisticians are available for hiring.
Email us with your stats questions!
For more information, visit https://cscar.research.umich.edu.
Today’s workshop was inspired by the R workshop taught by Chris Andrew.
Data types
Data structures
Conditional if/else statements
For/while loops
Functions
Statistical tools
Graphics
Extensibility
Reproducibility
FREE!
RStudio is an integrated development environment (IDE) for R.
Run the code
cmd + return / ctrl + enter
Create a comment
cmd + shift + return / ctrl + shift + enter
2_basic_calculation.R
Exercise: How to compute logarithms with a different base?
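One possible answer to the exercise, as a minimal sketch: base R's log() takes an optional base argument, and log2() and log10() are shortcuts for two common bases. The object names below are illustrative only.

```r
# Natural logarithm (base e) is the default
ln_e <- log(exp(1))          # 1

# Pass the base argument to use a different base
log_base2 <- log(8, base = 2)    # 3

# Convenience functions for base 2 and base 10
log2_8 <- log2(8)                # 3
log10_1000 <- log10(1000)        # 3

print(c(ln_e, log_base2, log2_8, log10_1000))
```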
1_sample_script.R
double: 2, 1.5
integer: 2L
character: "abc", "1"
logical: TRUE, FALSE
T, F
missing value: NA
NA_real_, NA_integer_, NA_character_, NA
Missing values tend to be infectious: most operations involving a missing value will return another missing value.
Of course, there are some exceptions.
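The data types and the behavior of missing values described above can be checked at the console. This is a small sketch using only base R; the object names are illustrative.

```r
# typeof() reports the underlying data type
types <- c(typeof(2), typeof(2L), typeof("abc"), typeof(TRUE))
print(types)   # "double" "integer" "character" "logical"

# Most operations involving NA return NA
na_sum <- NA + 1                  # NA
na_cmp <- NA > 5                  # NA
na_mean <- mean(c(1, NA, 3))      # NA

# Many functions can drop missing values on request
clean_mean <- mean(c(1, NA, 3), na.rm = TRUE)   # 2

# Some exceptions: the result is known whatever the missing value is
na_pow <- NA^0          # 1
na_or <- NA | TRUE      # TRUE
na_and <- NA & FALSE    # FALSE

# Test for missing values with is.na(), not with ==
is_missing <- is.na(c(1, NA, 3))   # FALSE TRUE FALSE
print(is_missing)
```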
atomic vectors
lists
Exercise: What if we mix data types within an atomic vector?
coercion
Data values are coerced in a fixed order:
character \(\leftarrow\) double \(\leftarrow\) integer \(\leftarrow\) logical
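The coercion order above can be seen by mixing types inside c() and inspecting the result with typeof(). A minimal sketch, with illustrative object names; a list is included for contrast because it keeps each element's type.

```r
# Mixing types in an atomic vector coerces to the most flexible type
lgl_int <- c(TRUE, 2L)          # logical is coerced to integer
int_dbl <- c(2L, 1.5)           # integer is coerced to double
dbl_chr <- c(1.5, "abc")        # double is coerced to character

print(typeof(lgl_int))   # "integer"
print(typeof(int_dbl))   # "double"
print(typeof(dbl_chr))   # "character"

# With all four types, everything becomes character
all_mixed <- c(TRUE, 2L, 1.5, "abc")
print(all_mixed)         # "TRUE" "2" "1.5" "abc"

# A list does not coerce: each element keeps its own type
mixed_list <- list(TRUE, 2L, 1.5, "abc")
list_types <- sapply(mixed_list, typeof)
print(list_types)        # "logical" "integer" "double" "character"
```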
A matrix or an array is a vector with a dimension attribute.
A matrix has 2 dimensions; an array can have any number of dimensions.
matrix(1:12, nrow = 4, ncol = 3)
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12
matrix(1:12, nrow = 4)
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12
matrix(1:12, ncol = 3)
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12
Exercise: How to create a matrix with values filled by row instead?
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
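One way to get the row-wise layout shown above is the byrow argument of matrix(). A minimal sketch; the object name m_by_row is illustrative.

```r
# byrow = TRUE fills the matrix one row at a time instead of by column
m_by_row <- matrix(1:12, nrow = 4, ncol = 3, byrow = TRUE)
print(m_by_row)

# The first row now holds 1, 2, 3 and the last row 10, 11, 12
print(m_by_row[1, ])
print(m_by_row[4, ])
```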
3_objects.R
An object is an entity that contains information and can be manipulated by commands.
Use an object name that’s informative.
R is case sensitive.
Use _ or . as separators in the object name. (A hyphen is read as the subtraction operator, so it cannot be part of a name.)
An object name cannot start with a number.
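The naming guidelines above can be illustrated with a few assignments. This is only a sketch; all object names are made up for the example.

```r
# Informative names, with _ or . separating words
city_name <- "Ann Arbor"
city.state <- "MI"

# R is case sensitive: city_name and City_Name are two different objects
City_Name <- "Boston"
same_object <- identical(city_name, City_Name)
print(same_object)   # FALSE

# Not valid, shown as comments only:
#   1city <- "Atlanta"     # syntax error: name starts with a number
#   city-name <- "Atlanta" # parsed as city minus name, not as one name
```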
In 3_objects.R, follow instructions and create the following objects:
vec with values 1, 5, 6, 9, 0
mat with 3 columns and 4 rows, using integer values from -4 to 7. Values shall be filled by row.
ls with the above objects vec and mat as its elements
a data frame named df with four named columns:
city: Ann Arbor, Boston, Atlanta
state: MI, MA, GA
lat: 42.278046, 42.361145, 33.753746
lng: -83.738220, -71.057083, -84.386330
class()
returns the (high-level) type of the object
attributes()
returns the metadata (if any) associated with the object
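One possible solution to the object-creation exercise, followed by class() and attributes() applied to the results. This is a sketch rather than the workshop's official answer; the object names and values come from the exercise text.

```r
# Atomic vector
vec <- c(1, 5, 6, 9, 0)

# 4 x 3 matrix of the integers -4 to 7, filled by row
mat <- matrix(-4:7, nrow = 4, ncol = 3, byrow = TRUE)

# List holding the two objects above
# (the name ls comes from the exercise; it does not stop the ls() function from working)
ls <- list(vec, mat)

# Data frame with four named columns
df <- data.frame(
  city  = c("Ann Arbor", "Boston", "Atlanta"),
  state = c("MI", "MA", "GA"),
  lat   = c(42.278046, 42.361145, 33.753746),
  lng   = c(-83.738220, -71.057083, -84.386330)
)

# class() gives the high-level type of each object
print(class(vec))   # "numeric"
print(class(mat))   # "matrix" "array" in R 4.0 and later
print(class(ls))    # "list"
print(class(df))    # "data.frame"

# attributes() shows the metadata: dim for a matrix;
# names, class, and row.names for a data frame
print(attributes(mat))
print(names(attributes(df)))
```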
vec
## [1] 1 5 6 9 0
# subset by index
vec[1]
## [1] 1
# assign the subsetted value to another object
sub_vec = vec[2]
sub_vec
## [1] 5
# subset elements in any order
vec[c(1,3)]
## [1] 1 6
vec[c(3,1)]
## [1] 6 1
# elements can be subsetted any number of times
vec[c(3,1,5,3,5)]
## [1] 6 1 0 6 0
# extract all elements in the vector except the second
vec[-2]
## [1] 1 6 9 0
Exercise: extract all of mat except the third column
     [,1] [,2]
[1,]   -4   -3
[2,]   -1    0
[3,]    2    3
[4,]    5    6
Exercise: reorder the rows of mat so the first row becomes the last, and the last row is the first
     [,1] [,2] [,3]
[1,]    5    6    7
[2,]   -1    0    1
[3,]    2    3    4
[4,]   -4   -3   -2
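The two matrix results shown above can be produced with the same indexing rules used for vectors, applied to rows and columns. A minimal sketch, assuming mat was built as in the earlier exercise; the result names are illustrative.

```r
# Recreate mat so the example runs on its own
mat <- matrix(-4:7, nrow = 4, ncol = 3, byrow = TRUE)

# A negative column index drops that column
mat_no_third <- mat[, -3]
print(mat_no_third)

# A vector of row indices returns the rows in that order:
# row 4 first, rows 2 and 3 unchanged, row 1 last
mat_swapped <- mat[c(4, 2, 3, 1), ]
print(mat_swapped)
```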
df
##        city state      lat       lng
## 1 Ann Arbor    MI 42.27805 -83.73822
## 2    Boston    MA 42.36115 -71.05708
## 3   Atlanta    GA 33.75375 -84.38633
df[,1]
## [1] "Ann Arbor" "Boston" "Atlanta"
df[2,1]
## [1] "Boston"
df$city
## [1] "Ann Arbor" "Boston" "Atlanta"
df[df$city == "Ann Arbor",]
##        city state      lat       lng
## 1 Ann Arbor    MI 42.27805 -83.73822
df$lat[df$city == "Ann Arbor"]
## [1] 42.27805
which()
The which() function takes a logical statement as an argument and returns the indices for which the statement is true.
if...else statement
if...else is a conditional statement.
An if statement can be followed by an optional else statement.
if...else statement (cont.)
4_ifelse.R
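The two ideas above, which() and a basic if...else, can be sketched as follows. The data frame repeats the df built earlier so the example runs on its own; the latitude threshold and the variable x are made up for illustration.

```r
df <- data.frame(
  city  = c("Ann Arbor", "Boston", "Atlanta"),
  state = c("MI", "MA", "GA"),
  lat   = c(42.278046, 42.361145, 33.753746),
  lng   = c(-83.738220, -71.057083, -84.386330)
)

# which() returns the positions where the logical statement is TRUE
north_idx <- which(df$lat > 40)
print(north_idx)            # 1 2
print(df$city[north_idx])   # "Ann Arbor" "Boston"

# if...else runs one branch or the other depending on the condition
x <- 5
if (x > 3) {
  msg <- "x is greater than 3"
} else {
  msg <- "x is not greater than 3"
}
print(msg)
```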
if...else if...else statement
Exercise: Using the syntax provided on the left, write an if...else if...else statement that performs the following operation:
x > 0, print positive number
x = 0, print zero
x < 0, print negative number
for loops
5_for_while_loops.R
while loops
5_for_while_loops.R
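The loop constructs named above can be sketched as follows. The for loop reuses the if...else if...else exercise (positive, zero, negative) on each element of a vector; the input values and object names are made up for illustration and are not taken from 5_for_while_loops.R.

```r
values <- c(-2, 0, 3)
labels <- character(length(values))

# for loop: repeat the body once for each index of the vector
for (i in seq_along(values)) {
  x <- values[i]
  if (x > 0) {
    labels[i] <- "positive number"
  } else if (x == 0) {
    labels[i] <- "zero"
  } else {
    labels[i] <- "negative number"
  }
}
print(labels)   # "negative number" "zero" "positive number"

# while loop: repeat the body as long as the condition is TRUE
n <- 1
total <- 0
while (n <= 3) {
  total <- total + n
  n <- n + 1
}
print(total)    # 6
```

Note that equality is tested with == inside the condition; a single = would be an assignment.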
6_functions.R
Let’s try to write a function that converts Fahrenheit to Celsius using the formula \[ C = \frac{5}{9}(F - 32)\]
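The formula above translates directly into an R function. The sketch below also includes a Kelvin conversion, K = C + 273.15, written in the same pattern; the name fahrenheit_to_celsius() and the argument names are illustrative choices, not taken from 6_functions.R.

```r
# Convert Fahrenheit to Celsius: C = 5/9 * (F - 32)
fahrenheit_to_celsius <- function(temp_f) {
  (temp_f - 32) * 5 / 9
}

# Convert Celsius to Kelvin: K = C + 273.15
celsius_to_kelvin <- function(temp_c) {
  temp_c + 273.15
}

print(fahrenheit_to_celsius(32))    # 0
print(fahrenheit_to_celsius(212))   # 100
print(celsius_to_kelvin(0))         # 273.15

# Functions can be chained, and they work on whole vectors
print(celsius_to_kelvin(fahrenheit_to_celsius(c(32, 212))))
```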
Exercise
Write a function celsius_to_kelvin() that converts Celsius to Kelvin, using the formula \[K = C + 273.15\]
apply()
6_functions.R
The apply() function applies a function to each row or column of a matrix/data frame.
apply()
Exercise: write a function add_first_last() that computes the sum of a vector’s first and last elements, and create a matrix m of the form
     [,1] [,2] [,3]
[1,]    9    4    7
[2,]   14   47   74
[3,]    5    0    1
[4,]    6   10   19
apply() (cont.)
m.
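A sketch of how add_first_last() and the matrix m from the exercise might be combined with apply(). The second argument of apply() selects the direction: 1 for rows, 2 for columns. Applying the function in both directions is an assumption about what the exercise asks for.

```r
# Sum of a vector's first and last elements
add_first_last <- function(x) {
  x[1] + x[length(x)]
}

# The matrix m from the exercise, filled by row
m <- matrix(c(9, 4, 7,
              14, 47, 74,
              5, 0, 1,
              6, 10, 19),
            nrow = 4, byrow = TRUE)

# MARGIN = 1: apply the function to each row
row_result <- apply(m, 1, add_first_last)
print(row_result)   # 16 88 6 25

# MARGIN = 2: apply the function to each column
col_result <- apply(m, 2, add_first_last)
print(col_result)   # 15 14 26
```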